feat: make async in-flight task cap configurable#699
Conversation
Review: PR #699 — feat: make async task cap configurableSummaryPromotes the previously hard-coded async scheduler submission cap to a public FindingsCorrectness — looks right
Behavior change worth calling outThe default jumps from 256 → 1024 (4×). Users who never set this knob will see significantly more concurrent task leases. The PR notes this as intended (so model-heavy runs don't get throttled by the old floor), but consider:
Not blocking, but a release note / changelog entry beyond the PR body would help downstream users notice. Naming / minor consistency
Test coverage — solid
One gap: there's no test that exercises the floor behavior — i.e., a user setting Style / project conventions
Security / performance
VerdictApprove-with-nits. The wiring is clean, tests cover the happy path and validation, and the layering respects the config → engine direction. Suggested non-blocking follow-ups:
|
Greptile SummaryThis PR exposes
|
| Filename | Overview |
|---|---|
| packages/data-designer-config/src/data_designer/config/run_config.py | Adds max_in_flight_tasks: int = Field(default=1024, ge=1) with docstring and Pydantic validation; straightforward field addition. |
| packages/data-designer-engine/src/data_designer/engine/dataset_builders/async_scheduler.py | Renames max_submitted_tasks → max_in_flight_tasks, removes DEFAULT_TASK_POOL_SIZE/MODEL_TASK_ADMISSION_HEADROOM_MULTIPLIER, simplifies queue guard to max_in_flight_tasks * 4, and removes unused 'local' key from resource_limits. |
| packages/data-designer-engine/src/data_designer/engine/dataset_builders/dataset_builder.py | Replaces the aggregate model-request-headroom multiplier with a direct read of run_config.max_in_flight_tasks; sets both max_in_flight_tasks and max_model_task_admission to the same value. |
| packages/data-designer-engine/src/data_designer/engine/dataset_builders/scheduling/task_admission.py | Extracts DEFAULT_IN_FLIGHT_TASK_CAPACITY = 1024 constant and updates TaskAdmissionConfig.submission_capacity default; no logic changes. |
| packages/data-designer-engine/tests/engine/dataset_builders/test_async_scheduler.py | Renames all max_submitted_tasks usages to max_in_flight_tasks and adds test_scheduler_adaptive_row_group_queue_guard_uses_in_flight_task_cap covering the new guard formula. |
| packages/data-designer-engine/tests/engine/dataset_builders/test_async_builder_integration.py | Updates integration test to assert max_in_flight_tasks and max_model_task_admission are both propagated from run_config, and guards against any future accidental call to get_aggregate_max_parallel_requests. |
| packages/data-designer/tests/interface/test_data_designer.py | Extends test_run_config_setting_persists to verify max_in_flight_tasks round-trips through all three set-config calls. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
RC[RunConfig\nmax_in_flight_tasks=1024] -->|read| DB[DatasetBuilder\n_prepare_async_run]
DB -->|max_in_flight_tasks\nmax_model_task_admission| ATS[AsyncTaskScheduler.__init__]
ATS -->|submission_capacity=max_in_flight_tasks\nresource_limits llm_wait=max_model_task_admission| TAC[TaskAdmissionConfig]
ATS -->|model_group_limit_cap=max_model_task_admission| TSR[TaskSchedulingResolver]
ATS -->|_max_in_flight_tasks| QG[Queue Guard\nmax_in_flight_tasks × 4]
TAC --> CTRL[TaskAdmissionController\nleases submission + llm_wait]
TSR -->|admitted_limit per model group| CTRL
QG -->|blocks new row-group admission| RGA[Adaptive Row-Group Admission]
Reviews (5): Last reviewed commit: "feat: make async in-flight task cap conf..." | Re-trigger Greptile
1b75986 to
5981281
Compare
1bcb049 to
5ae93df
Compare
nabinchha
left a comment
There was a problem hiding this comment.
Thanks for putting this together, @eric-tramel — clean, well-scoped change.
Summary
Adds RunConfig.max_in_flight_tasks (default 1024, ge=1) so users can tune the async scheduler's task-lease capacity, raises the default from 256 to 1024, and wires the value end-to-end through DatasetBuilder into AsyncTaskScheduler (including the model-stage admission floor and capacity reporting). The implementation matches the stated intent in the PR description and keeps the engine/config layering intact — config grows a new field, engine consumes it.
Findings
Warnings — Worth addressing
packages/data-designer-engine/src/data_designer/engine/dataset_builders/dataset_builder.py:1064-1068 — max_model_task_admission floor can surprise users tuning down
- What: When a user explicitly lowers
max_in_flight_tasks(e.g.64), the floor still pinsmax_model_task_admissiontomax(DEFAULT_TASK_POOL_SIZE=1024, 64, 2 * aggregate)=1024. So thellm_waitresource keeps a 1024-slot ceiling even though the user asked for a much smaller cap. - Why: This is consistent with the pre-PR behavior (the floor was 256 before), so it isn't a regression — but combined with the new public knob, it could read as "I set the cap to 64, why is the LLM stage still letting through 1024?" The
localandsubmissionresources do drop with the user's value; onlyllm_waitkeeps the floor. The new docstring onmax_in_flight_tasksdoesn't hint at that asymmetry. - Suggestion: Either (a) drop the
DEFAULT_TASK_POOL_SIZEterm from themax(...)(usemax(max_in_flight_tasks, 2 * aggregate)) so the user's setting always governs, or (b) add a one-line note to themax_in_flight_tasksdocstring that the model-stage admission floor stays at the engine default. (a) is cleaner if there's no specific reason to keep the floor when the user has opted in to a smaller cap.
Suggestions — Take it or leave it
packages/data-designer-engine/src/data_designer/engine/dataset_builders/async_scheduler.py:80 — Two names for the same constant
- What:
DEFAULT_TASK_POOL_SIZEis now defined asDEFAULT_IN_FLIGHT_TASK_CAPACITY. Both names refer to the same1024value but live in two modules. - Why: It's preserved as an alias (likely for downstream callers and tests that import it), but two names for one concept costs a bit of reading time when navigating the scheduler.
- Suggestion: As a follow-up, consider collapsing to a single name (
DEFAULT_IN_FLIGHT_TASK_CAPACITYreads well in context) once you're comfortable touching the in-tree import sites. Not blocking — and worth doing separately so this PR stays focused.
packages/data-designer-engine/tests/engine/dataset_builders/test_async_builder_integration.py:194 — Test name vs. what it now asserts
- What:
test_prepare_async_run_enables_request_pressure_advisorynow also assertsmax_in_flight_tasks/max_model_task_admissionpropagation. - Why: Searching for "what tests cover the new knob" won't surface this test by name. Both assertions are valid checks of the same code path, but the test name only describes the first.
- Suggestion: Either split into a focused
test_prepare_async_run_propagates_max_in_flight_tasksor rename to something liketest_prepare_async_run_propagates_scheduler_kwargs. Tiny ergonomics nit.
packages/data-designer-config/src/data_designer/config/run_config.py:175-182 — Field(description=...) duplicates the Attributes: docstring
- What: The class docstring already has a full
max_in_flight_tasks:entry, andField(description=...)repeats a shorter version of the same text. - Why: Other simple numeric fields in
RunConfig(e.g.buffer_size,non_inference_max_parallel_workers) don't carry adescription=. The redundancy is harmless but slightly inconsistent with the surrounding pattern. - Suggestion: Either drop the
description=(keep justField(default=1024, ge=1)) or, if you'd rather keep schema-level descriptions, leave it as-is. Pure style call.
What Looks Good
- The new knob's docstring is the standout — it explicitly disambiguates
max_in_flight_tasksfrommax_parallel_requests, which is exactly the confusion this change could otherwise introduce. - Coverage is layered well: config-level validation (
test_run_config_*), interface persistence (test_run_config_setting_persists), and builder propagation (the integration test). Each layer asserts the right thing without overlap. - The capacity-source flip from
dataset_builder→run_configonsubmission_capacityis a small but accurate provenance change, and it's still type-safe underCapacityValueSource. - Renaming
max_submitted_tasks→max_in_flight_tasksis a real readability win — "submission" was just one phase, and the new name captures what the cap actually governs.
Verdict
Ship it (with nits) — the one Warning is non-blocking (a behavior nuance worth either tweaking or documenting), and the rest are pure style. Nothing critical here.
9c7d4d2 to
eccce53
Compare
johnnygreco
left a comment
There was a problem hiding this comment.
@eric-tramel I rechecked the review feedback against the current PR head. The substantive/actionable items look addressed: max_model_task_admission now follows max_in_flight_tasks directly, the old aggregate/floor behavior is gone, and the focused verification passed on my side. The remaining notes are optional nits/follow-ups, not blockers. Approving.
Signed-off-by: Eric W. Tramel <eric.tramel@gmail.com>
eccce53 to
a0fa27c
Compare
johnnygreco
left a comment
There was a problem hiding this comment.
Re-checked after the branch update / conflict resolution. The scheduler task-cap behavior still matches the previously reviewed version: max_model_task_admission follows max_in_flight_tasks directly, and the focused propagation coverage is still in place. Re-approving.
📋 Summary
Adds a public
RunConfig.max_in_flight_taskssetting for the async scheduler's task-lease capacity and raises the default from 256 to 1024. The configured value is now the singular scheduler work cap: model-stage task admission and adaptive row-group queue backpressure follow it, while model API request concurrency remains controlled separately bymax_parallel_requestsand request admission.🔗 Related Issue
N/A
🔄 Changes
RunConfig.max_in_flight_taskswith a default of 1024 and validation that it is at least 1.AsyncTaskScheduler.max_in_flight_tasksdirectly, without the old default floor or aggregate request headroom.DEFAULT_IN_FLIGHT_TASK_CAPACITYfor this concept.max_in_flight_tasks, rather than an inflated model-stage admission value.run_config.🧪 Testing
make testpasses (not run; focused tests below)Focused checks run after rebasing onto latest
origin/main:uv run --group dev pytest packages/data-designer-config/tests/config/test_run_config.py packages/data-designer-engine/tests/engine/dataset_builders/test_async_scheduler.py packages/data-designer-engine/tests/engine/dataset_builders/test_async_builder_integration.py packages/data-designer-engine/tests/engine/dataset_builders/test_dataset_builder.py::test_run_config_default_non_inference_max_parallel_workers packages/data-designer/tests/interface/test_data_designer.py::test_run_config_setting_persists(110 passed)uv run ruff format --check ...on touched filesuv run ruff check --output-format=full ...on touched filesgit diff --check✅ Checklist